Semantic Subgroup Discovery Systems and Workflows in the SDM-Toolkit

Authors

  • Anze Vavpetic
  • Nada Lavrac
Abstract

This paper addresses semantic data mining, a new data mining paradigm in which ontologies are exploited in the process of data mining and knowledge discovery. The paradigm is introduced together with two new semantic subgroup discovery systems, SDM-search for enriched gene sets (SDM-SEGS) and SDM-Aleph. These systems are made publicly available in the new SDM-Toolkit for semantic data mining. The toolkit is implemented in the Orange4WS data mining platform, which supports the construction of knowledge discovery workflows from local and distributed data mining services. Based on an experimental evaluation on two publicly available biomedical datasets, the paper presents a thorough quantitative and qualitative evaluation of SDM-SEGS and SDM-Aleph and compares them with SEGS, a system for enriched gene set discovery from microarray data.

Similar resources

Semantic Data Mining of Financial News Articles

Subgroup discovery aims at constructing symbolic rules that describe statistically interesting subsets of instances with a chosen property of interest. Semantic subgroup discovery extends standard subgroup discovery approaches by exploiting ontological concepts in rule construction. Compared to previously developed semantic data mining systems SDM-SEGS and SDM-Aleph, this paper presents a gener...

Cluster Based Cross Layer Intelligent Service Discovery for Mobile Ad-Hoc Networks

The ability to discover services in Mobile Ad hoc Network (MANET) is a major prerequisite. Cluster based cross layer intelligent service discovery for MANET (CBISD) is cluster based architecture, caching of semantic details of services and intelligent forwarding using network layer mechanisms. The cluster based architecture using semantic knowledge provides scalability and accuracy. Also, the mini...

A Modeling and Execution Environment for Distributed Scientific Workflows

The Scientific Data Management Center project (short: SDM) is part of a large research program sponsored by the US Department of Energy (DOE) to enable Scientific Discovery through Advanced Computing [SDM02, Sci]. SDM brings together research teams from DOE labs and universities to address and resolve novel data management challenges that arise due to the new data and information centric ways i...

Guided Composition of Tasks with Logical Information Systems - Application to Data Analysis Workflows in Bioinformatics

In a number of domains, particularly in bioinformatics, there is a need for complex data analysis. To that end, elementary data analysis operations called tasks are composed as workflows. The composition of tasks is, however, difficult due to the distributed and heterogeneous resources of bioinformatics. This doctoral work will address the composition of tasks using Logical Information System...

Adaptive Information Analysis in Higher Education Institutes

Information integration plays an important role in academic environments since it provides a comprehensive view of education data and enables managers to analyze and evaluate the effectiveness of education processes. However, the problem with traditional information integration is the lack of personalization due to weak information resources or the unavailability of analysis functionality. In this ...

Journal title:
  • Comput. J.

Volume 56

Publication date 2013